Contextual Information in Semantic Space Models Beyond Words and Documents

نویسندگان

  • Marco Baroni
  • Alessandro Lenci
  • Magnus Sahlgren
  • Roskilde Denmark
چکیده

Since synonyms are important lexical knowledge, various methods have been proposed for automatic synonym acquisition. Whereas most of the methods are based on the distributional hypothesis and utilize contextual clues, little attention has been paid to what kind of contextual information is useful for the purpose. As one of the ways to augment contextual information, we propose the use of indirect dependency, i.e. relation between two words related via two contiguous dependency relations. The evaluation has shown that the performance improvement over normal direct dependency is dramatic, yielding comparable results with surrounding words as context, even with smaller co-occurrence data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sixth International and Interdisciplinary Conference on Modeling and Using Context Workshop on Contextual Information in Semantic Space Models Beyond Words and Documents

Since synonyms are important lexical knowledge, various methods have been proposed for automatic synonym acquisition. Whereas most of the methods are based on the distributional hypothesis and utilize contextual clues, little attention has been paid to what kind of contextual information is useful for the purpose. As one of the ways to augment contextual information, we propose the use of indir...

متن کامل

Learning Distributed Document Representations for Multi-Label Document Categorization

Multi-label Document Categorization, the task of automatically assigning a text document into one or more categories has various real-world applications such as categorizing news articles, tagging Web pages, maintaining medical patient records and organizing digital libraries among many others. Statistical Machine Learning approaches to document categorization have focused on multi-label learni...

متن کامل

Text Categorization with Semantic Commonsense Knowledge

Most of text categorization research exploit bag-of-words text representation. In this approach, however, all contextual information contained in text is neglected. Therefore, capturing semantic similarity between text documents that share very little or even no vocabulary is not possible. In this paper we present an approach that combines well established kernel text classifiers with external ...

متن کامل

Semantic Similarity Measure Using Relational and Latent Topic Features

Computing the semantic similarity between words is one of the key challenges in many language-based applications. Previous work tends to use the contextual information of words to disclose the degree of their similarity. In this paper, we consider the relationships between words in local contexts as well as latent topic information of words to propose a new distributed representation of words f...

متن کامل

Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model

Techniques such as probabilistic topic models and latent-semantic indexing have been shown to be broadly useful at automatically extracting the topical or semantic content of documents, or more generally for dimension-reduction of sparse count data. These types of models and algorithms can be viewed as generating an abstraction from the words in a document to a lower-dimensional latent variable...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007